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LOCALLY ADAPTIVE DYNAMIC NETWORKS 

By Daniele Durante* and David B. Dunson^ 
University of Padova* and Duke University* 

Our focus is on realistically modeling and forecasting dynamic 
networks of face-to-face contacts among individuals. Important as¬ 
pects of such data that lead to problems with current methods in¬ 
clude the tendency of the contacts to move between periods of slow 
and rapid changes, and the dynamic heterogeneity in the actors’ con¬ 
nectivity behaviors. Motivated by this application, we develop a novel 
method for Locally Adaptive DYnamic (LADY) network inference. 
The proposed model relies on a dynamic latent space representation 
in which each actor’s position evolves in time via stochastic differen¬ 
tial equations. Using a state space representation for these stochastic 
processes and Polya-gamma data augmentation, we develop an effi¬ 
cient MCMC algorithm for posterior inference along with tractable 
procedures for online updating and forecasting of future networks. We 
evaluate performance in simulation studies, and consider an applica¬ 
tion to face-to-face contacts among individuals in a primary school. 


1. Introduction. We are interested in studying face-to-face dynamic interactions among 
individuals in a primary school. Understanding key aspects of these time-varying interaction 
networks and forecasting of future contacts is interesting sociologically and important in 
infectious disease epidemiology. As illustrated in Figure 1, data consist of a sequence of V x V 
time-varying symmetric adjacency matrices Y tl ,..., Y tn having entries Y t .[ vu ] = Y t .{ uv \ = 1 if a 
face-to-face contact has been recorded between actors v = 2,..., V and it = 1,, v—1 at time 
ti = ti,...,t n , and Y t .i vu i = Y t . r uv i = 0 if no contact has been observed. These undirected 
dynamic networks are available at http://www.sociopatterns.org; see also Stehle et al. 
[2011] and Gemmetto, Barrat and Cattuto [2014] for additional details. 

The increasing availability of new sensing devices and wearable sensors to trace human in¬ 
teraction behaviors allows a growing access to these type of dynamic networks, while opening 
new avenues for studying underlying patterns in social interactions and how these processes 
relate to associated dynamic systems such as epidemic spreading. Recent studies have inves¬ 
tigated dynamic face-to-face human interactions in several environments. Isella et al. [2011] 
focus on contact dynamics among individuals in two different scenarios, covering a scientific 
conference and a long-running museum exhibition, respectively. Vanhems et al. [2013] study 
interactions among staff members and patients in a hospital. Stehle et al. [2011], Gemmetto, 
Barrat and Cattuto [2014] and Fournet and Barrat [2014], Mastrandrea, Fournet and Barrat 
[2015] investigate face-to-face contact dynamics among students in primary and high schools, 
respectively; refer also to Barrat and Cattuto [2013] for a review. 

The above studies mostly focus on descriptive analyses in order to provide a summarized 
overview of the topological structures underlying the observed networks and how these mea¬ 
sures relate to environmental conditions and other variables. Wyatt, Choudhury and Bilmes 
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Fig 1. For selected times, adjacency matrices representing the observed face-to-face contact networks. 


[2008] analyze, instead, face-to-face contacts from a model-based perspective, aggregating the 
dynamic interactions into a single network measuring duration of contact. Although these 
procedures provide valuable insights, flexible statistical models of how the human interaction 
networks evolve in time would provide improved ability to jointly infer dynamic changes in 
network structures, while accounting for uncertainty. In addition, such models would be useful 
in terms of prediction and forecasting of contacts, which is of key interest in epidemiology. 

There is a rich literature on dynamic networks, but several aspects of our motivating 
application require careful innovation. These aspects include the tendency of the contacts to 
move between periods of slow and rapid changes, the dynamic heterogeneity in the actors’ 
connectivity behaviors, and the need for fast and accurate online updating and forecasting 
procedures for timely prediction of future interactions and design of appropriate outbreak 
prevention policies. In addressing these goals, it is fundamental to borrow information within 
each network and across time, while scaling to a larger number of time points t\,... ,t n and 
to a moderately large set of actors, without affecting flexibility in modeling dynamic contacts. 

1.1. Dynamic face-to-face human contact networks. We focus on the face-to-face dynamic 
interaction networks among individuals in a primary school in Lyon, France. Raw contact 
data are available for 232 students between 6 and 12 years of age and 10 teachers, during two 
consecutive school days running from ~ 08:40 to ~ 17:10. The primary school is characterized 
by 5 grades, each divided into two classes comprising on average 24 students. 

Face-to-face contact data are monitored via wearable radio frequency identification devices 
(RFID), exchanging low-power radio packets when two individuals face each other at a dis¬ 
tance of ~ 1 — 1.5 meters. This proximity range is chosen to represent a reasonable proxy of 
close social contact, while indicating a potential occasion of disease transmission [Stehle et al. 
2011]. Raw data are available for consecutive windows of 20 seconds and encode which pairs 
of individuals had a face-to-face contact in each one of these time intervals; refer to Cattuto 
et al. [2010] for a description of RFID proximity-sensing devices. 

Initial descriptive analyses of these data highlight a very sparse and noisy structure with 
only 40 contacts — among the 29,161 possible — monitored on average for every window 
of 20 seconds. This time scale might be too narrow to highlight recurring patterns in the 
dynamic evolution of the underlying network topological structures. Hence, we aggregate the 
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Fig 2. Time-varying observed network summary measures for the first day of school. Upper panels: global 
network measures. Bottom panels: degree of selected actors. 


data in consecutive time windows of 10 minutes so that the resulting networks encode which 
pairs of individuals established at least one face-to-face proximity contact during each one of 
these consecutive 10 minutes time intervals. Focusing on binary connections instead of the 
cumulative number of contacts in the 10 minutes time windows provides a simpler starting 
point. Moreover, under an epidemiological perspective, at least one proximity contact of 20 
seconds may be sufficient for disease transmission. Although we loose short scale dynamics, 
these windows are sufficiently wide to highlight longer range patterns in the network topology, 
but maintain enough granularity to capture sharp changes which may occur in correspondence 
of breaks, lunch times and school hours. We found these underlying structures quite robust to 
moderate changes in the length of the time intervals, including 5, 15 and 20 minutes. Stehle 
et al. [2011] consider a similar aggregation to study dynamic changes in the averaged degree. 

In analyzing these data, we seek inference and forecasting procedures which are sufficiently 
flexible to capture different types of dynamic changes in the network data. Dynamic variations 
in connectivity patterns may be influenced by the underlying endogenous architectures as well 
as exogenous factors, such as changing spatial environments and class or gender homophily. 
Information on class membership and gender are available for all the individuals — except 
teachers — while approximate changes in spatio-temporal locations are provided for 5 classes 
out of 10 in Figure 10 of Stehle et al. [2011]. We focus on the students and teachers in these 
5 classes — for a total of V = 120 actors — and provide inference on the data from the first 
day Y tl ,..., Y tn , while using networks Y t *,..., Y* n in the second day to evaluate out-of-sample 
predictive performance. In analyzing these data, we develop a flexible dynamic latent space 
model relaxing the complex dependence structure within the network to one of conditional 
independence between contacts among pairs of actors given their positions in a latent space. 
Therefore, focusing on a subset of individuals of interest, instead of modeling the network of 
contacts in the entire school, does not contradict the assumptions of our statistical model. 

As shown in Figure 2, the trajectories of the global and actor-specific summary measures 
cycle irregularly between phases characterized by slower and more rapid variations. Flexibly 
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capturing such behavior is important to improve prediction and investigate how dynamic 
face-to-face interactions relate to specific events, such as school hours, breaks, lunch time 
and changing environments. Instead of directly including covariate information on gender, 
class membership and spatial locations in the dynamic model, we use these information to 
assess the extent to which our model can learn known structures in the data. Current models 
for dynamic networks typically rely on homogeneity and stationarity assumptions — for both 
endogenous and exogenous effects — and hence have difficulties in modeling variations over 
time of specific network structures. This can have a strong effect on the quality of inferences 
and predictions, with under-smoothing during periods of stable contacts and over-smoothing 
across times of rapid variations. Motivated by our face-to-face contact network data and by 
the need for flexible methods enforcing time-varying smoothness in dynamic networks, we 
develop a Locally Adaptive DYnamic (LADY) network model that characterizes the time- 
varying edge probabilities via latent processes, which have time-varying smoothness. 

1.2. Relevant literature. There is a growing literature on statistical modeling of dynamic 
networks. Much of the literature focuses on the case in which the exact time of each contact 
event is observed; see for example Butts [2008] and DuBois et al. [2013]. We instead consider 
the case in which snapshots of a dynamic network are observed at different time points. 

A popular class of procedures generalizes exponential random graph models (ERGMs) 
to include discrete time Markov dynamics [Robins and Pattison 2001, Hanneke, Fu and 
Xing 2010, Krivitsky and Handcock 2014]. Holland and Leinhardt [1977], and the subsequent 
improvements of Snijders [2001 2005] and Snijders, van de Bunt and Steglich [2010] provide 
alternative continuous time Markov specifications allowing the rate of change in the network 
to vary with time. These models are elegant and allow dynamic inference on several network 
characteristics. However, current specifications do not fully accommodate non-homogeneous 
dynamics and heterogenous contact patterns, which are key aspects of our data. 

There is also a rich literature on alternatives to ERGMs, including stochastic block models 
[Fienberg and Wasserman 1981, Nowicki and Snijders 2001], mixed membership stochastic 
block models [Airoldi et al. 2008] and latent space models [Hoff, Raftery and Handcock 2002]. 
These methods characterize the edges as conditionally independent Bernoulli variables given 
their corresponding edge probabilities, with these probabilities further defined as a func¬ 
tion of actor-specific latent variables. As highlighted in Hunter, Krivitsky and Schweinberger 
[2012], building on conditional independence provides computational benefits in facilitating 
implementation of MCMC methods. Moreover, although — differently from dynamic ERGMs 
- these procedures do not explicitly parameterize inter-dependence between relations, the 
shared dependence on a common set of actor-specific latent variables can induce flexible 
dependence structures and allow for across-actor heterogeneity in dynamic contacts. 

Dynamic stochastic block models have been considered by Yang et al. [2009 2010] and 
later refined by Xu and Hero [2014] and Xu [2015]. These approaches are specifically tailored 
for learning dynamic changes in shared connectivity behaviors, and may fail to accurately 
characterize and predict contact patterns different than those arising from block structures. 
Dynamic relational feature models [Foulds et al. 2011] improve flexibility by replacing the 
single block membership indicator with vectors describing presence or absence of features for 
each actor, but assume a time-constant features-interaction matrix restricting the dynamics. 

Dynamic mixed membership stochastic block models [Xing, Fu and Song 2010] and latent 
space models [Sarkar and Moore 2005, Sewell and Chen 2015] are more flexible. Typical 
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approaches incorporate dynamics through state space models, Markov processes and random 
walk trajectories. To address computational intractability, approximations are used including 
the extended Kalman filter and variational approaches. Durante and Dunson [2014] embed 
the actors in a latent space and allow their coordinates to evolve in continuous time via 
Gaussian processes (GP). Their approach provides a simple Gibbs sampling algorithm, but 
faces the usual computational bottlenecks of GPs in scaling to a large numbers of time points, 
and the dynamic network inherits the stationary dependence structure of the latent GPs. 

Outside of the network field, there are several approaches to incorporate locally adaptive 
smoothness in dynamic processes. Particularly relevant to our work is the nested Gaussian 
process (nGP) proposed by Zhu and Dunson [2013] for regression and extended to multivariate 
time series by Durante, Scarpa and Dunson [2014], The nGP models the trajectories’ mth 
order derivatives via GPs, which are in turn centered on a higher level GP instantaneous mean 
that favors time-varying smoothness. Similar constructions are lacking in network fields. 

Our LADY network model accounts for across-actor heterogeneity via a latent space for¬ 
mulation with nGPs to induce time variations in the rate of change of the network structure. 
By considering a state space representation for the latent processes, we reduce the compu¬ 
tational burden from cubic in the number of time points to linear, while developing simple 
procedures for forecasting, prediction and online updating appropriate to streaming networks. 
In Section 2 we describe the LADY network model. Posterior computation procedures are 
provided in Section 3 with an additional focus on forecasting and online updating. In Section 
4 we consider a simulation study to test our methods in relation to existing dynamic network 
models. Section 5 presents the results for our analysis of face-to-face contact network data. 

2. LADY networks. We assume that the observed data Yt 1; ... ,Yt. n provide a realiza¬ 
tion — on a discrete time grid t \,..., t n — of the continuous latent process {y(t) : t E T C 
3i + } and seek a representation for the generative mechanism associated with this network¬ 
valued stochastic process, which is consistent with the goals discussed in Section 1. As our 
contact networks are undirected, it is sufficient to model the lower triangular elements of y(t) 
since y vu (t ) = y uv (t) for every v = 2,..., V, u = 1,..., v — 1 and t E T. In particular, we let 

(2.1) y vu (t) | 7 T vu (t) ~ Bern {ir vu (t)} , 
for every v = 2,..., V, u = 1,..., v — 1 and t E T, with 

(2.2) n vu (t) = [l + exp{—n(t) - x v (t) T x u (t)}] \ 

Under (2.1) the edges y vu (t) E {0,1} are conditionally independent Bernoulli random vari¬ 
ables given their corresponding edge probabilities n vu (t) E (0,1). These edge probabilities are 
obtained by mapping a latent similarity measure s vu (t) = /r(t) + x v (t.) T x u (t) from to (0,1) 
according to (2.2), where fi(t ) represents a baseline similarity score shared by all the edges at 
time t, whereas x v (t) = {a;„i(f), • • •, x v H(t)} T E 3? H and x u (t) = {x ul (t), ..., x uH (t)} T E $t H 
are the vectors of latent coordinates for actors v and u, respectively, at time t. Based on (2.2), 
the probability of an edge between actors v and u at time t increases with x v (t) T x u (t) E 3i. 

Our dynamic latent space representation collapses higher-order dependencies into a lower¬ 
dimensional space, reducing dimensionality from V(V — l)/2 stochastic processes on the edge 
probabilities to V x H — typically H <C V — latent trajectories and one baseline process. 
This construction recalls the statistic eigenmodel in Hoff [2008] , which provides a flexible class 
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of latent variable constructions for social networks allowing across-actor heterogeneity and 
accommodating various topological properties. Following Hoff [2008], model (2.1)-(2.2) gen¬ 
eralizes stochastic block models [Fienberg and Wasserman 1981, Nowicki and Snijders 2001] 
and latent distance models [Hoff, Raftery and Handcock 2002], and can accommodate block 
structures, homophily, and transitive contact patterns. These properties are — potentially — 
key factors underlying our face-to-face interaction data. For instance, during school hours or 
lunch times, the contact networks may exhibit block structures due to shared environments 
by students belonging to the same class or group of classes. Breaks are instead potentially 
associated with transitive patterns arising from friendship among students in different classes 
or homophily by gender. The low-rank decomposition in (2.2) is not unique. However, we 
avoid identifiability restrictions on the latent coordinates, as they are not required to ensure 
the identifiability of the edge probabilities, which are the focus of prediction and inference. 

In order to complete a representation of the LADY network model, we require priors on 
the stochastic processes : t E T} and {x v h(t) : t E T} for each v = 1,...,V and 

h = 1,..., H. If we define stationary processes, which assume that the correlation between 
the realizations at times t t and tj only depends on the time lag \tj — tj\ , it is straightforward 
to show that the resulting network-valued stochastic process will inherit this stationarity. 
Recalling the discussion in Section 1 and the descriptive analyses in Figure 2, it is necessary to 
accommodate non-stationarity to realistically characterize the dynamics underlying the face- 
to-face interaction data. However, this needs to be done in a careful way to avoid needing to 
estimate many parameters related to non-stationarity and face computational intractability. 
Although there is a rich literature on incorporating non-stationarity in GPs, such models 
tend to be highly challenging to implement even in simpler settings. 

With these issues in mind, we rely on nested GPs [Zhu and Dunson 2013] — rather than 
GPs — to induce highly flexible stochastic processes on {/r(t) : t E T} and {x v h(t) : f £ 1} for 
each v = 1,..., V and h = The nGPs explicitly incorporate time-varying smooth¬ 

ness by defining stochastic differential equations for the function’s derivatives. Focusing on 
the trajectory {x v h(t) : t G T}, the stochastic differential equation representation for the nGP 
can be accurately characterized by the following state equations for {x v h{t) : t E T}, it’s first 
order derivative {x' vh (t) : t E T} and the instantaneous mean {m v h(t) : f £ T} 


(2.3) 
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independently for v = 1,..., V and h = l,...,H, with (rj Xvht i, Vm vh ,i) T ~ N 2 (0, T, vhji ), E vhji = 
diag(<r^ (Tm Si) and Si = ti +1 — U sufficiently small. Similarly for {/i(t) : t E T}, we let 
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where (r/^*, r) Zji ) T ~ N 2 (0, E^j), with 1^,* = diag(<7^, a%6i). 
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The state equations (2.3)-(2.4) along with the observation equations (2.1)-(2.2) induce a 
provably flexible nonlinear logistic state space model for adaptive dynamic network inference, 
which is defined at every time grid t \,..., t n , including unequally spaced ones. Although the 
above state equations can be easily extended to model higher order derivatives for the latent 
trajectories and their local instantaneous means, equations (2.3)—(2.4) prove to be sufficiently 
flexible in inducing adaptive dynamics according to our results. 

There exists other possible methods for accommodating local adaptivity in the latent 
trajectories, such as free knot splines [Friedman 1991]. However, such approaches are com¬ 
putationally intensive due to the unknown numbers and locations of knots [Friedman 1991, 
George and McCulloch 1993]. This creates particular problems for large time grids and appli¬ 
cations requiring lots of changes. Our approach is appealing in providing a simple state space 
representation, which characterizes the latent positions at time tj+i as a first-order stochastic 
Taylor expansion of the same quantities at This choice improves scalability of the infer¬ 
ence procedures, while facilitating implementation of fast and tractable online updating and 
prediction strategies by adapting available techniques for state space models. 


3. Bayesian inference. Let be the prior for {ir vu (t) : v = 2,..., V, u = 1,..., v — 
1, 1 6 T} induced via (2.2) by the state equations (2.3)-(2.4) characterizing the nGP priors 
and n x , respectively. We consider a Bayesian approach for inference to update given 
the observed data Y tl ,..., Y tn . We leverage the Polya-gamma data augmentation for Bayesian 
logistic regression; see Poison, Scott and Windle [2013] for details and Choi and Hobert [2013] 
for theoretical properties. Letting Ti ~ Bern(7Tj) independently, with 7Tj = (1 + e~ x iP)~ l , 
Poison, Scott and Windle [2013] show that conditionally on Polya-gamma augmented data 
Ui ~ PG(l,xl/d), the contribution to the likelihood for the zth observation y* e {0,1} is 


(3.1) 


oc exp 



0.5 )/ Ui 



i = 1,..., n. 


Equation (3.1) is the kernel of a Gaussian distribution for data (y* — 0.5)/wj, with mean xJ/3 
and variance 1/oy. As a result, letting f3 ~ N p (b, B) be the prior for the coefficient vector (3 , 
given Polya-gamma augmented data, the Bayesian logistic regression on data y* can be recast 
in terms of Bayesian linear regression with Gaussian transformed response (y* — 0.5 )/uii. This 
allows a Gibbs algorithm, which alternates between sampling Polya-gamma augmented data 
and updating the coefficient vector /3 from its full conditional Gaussian distribution. 

By exploiting these results, we develop a simple and efficient Gibbs sampler for Bayesian 
inference in our LADY network model. Given Polya-gamma augmented data, our model can 
be recast as a Gaussian state space model for transformed data. By block-sampling in turn 
the latent coordinate processes for each actor v conditionally on the latent positions of the 
others u = 1 ,..., V, we obtain a linear observation equation, which allows us to apply 

standard results from Kalman filtering [Durbin and Koopman 2012], Posterior computation 
alternates between the following steps: 

Step [1]: Update augmented data uo vu (ti) from the full conditional Polya-gamma, co vu (ti ) | 
- ~ PG{1, y(ti) + Yjh = i x vh (ti)x uh (ti)}, for each v = 2,..., V, u = 1,..., v - 1 and 
time U = h,.. .,t n . 

Step [2]: Adapting (3.1) to our model, the likelihood for y = ... y(t n )} T given the 
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Polya-gamma augmented data and the latent coordinate processes is 



^2 Uvu ^ {{Y t . [vu] - 0.5 )/u} vu (ti) - n(U) - x v (ti) T x u {ti)Y 
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x vu\^i) 

E[tm]:u>u U vu(ti ) 


- Ufa) 


with r vu (ti) = Y ti { vu }-0.5-u vu (ti)x v (ti) T x u (ti). Let = E[„u] : „>u ^vuik) € 5P+ and 

Y n(ti) = E [vu]:v>u r vu{ti)/J2[vu]:v>u u vu(ti) € ft for i = 1 , ..., n, the above likelihood 

for the baseline vector // arises from the model 


(3-2) Y ll(u) = n(ti) + e M(t .), independently for every i = l,...,n, 

with ~ N(0,1 Hence, the observation equation (3.2) and the state equa¬ 

tions (2.4), define a Gaussian linear state space model, which allows updating for // = 
{//(H), .. . fj>(t. n )} T , / i' = {//'(H), ... {t n )} T and z = {z(H), • • • z(t n )} T via the simula¬ 
tion smoother of Durbin and Koopman [2002] . This has a computational complexity of 
0(n ) and diffuse initialization at H, {//(H), /t'(H), z(H)} t ~ N 3 {0, diag(100,100,100)}. 

Step [3]: For every actor v = 1,..., V, we rely on similar derivations to block-sample the 
latent coordinate trajectories { x v h(ti ) : h = 1,..., H, ti = H, ..., t n }, along with their 
first derivatives {x' vh {ti) : h = 1,... ,H, ti = ti,...,t n } and the local instantaneous 
means (m^(H) : h = 1,..., H, ti = H,..., t n \. Specifically, given the baseline process, 
the Polya-gamma augmented data, and the remaining row processes {x u h(U) : u / 
v , h = 1,..., H, ti = t \,..., t n }, we obtain the following linear observation equations 
in x v (U) 


(3.3) 


Y x v (u) = Y { _ v ^(U)x v (ti) + e Xti(ti ), independently for every i = l,...,n, 


where Xt_ v \ (H) is the (V—1) x H coordinate matrix at time H with the utli row held out, 
Y x / t .) denotes the (V—l)xl vector of transformed data Y Xv r t .\ = dia.g{0(^) — 

0.5-ly_i -n(ti)Q( v )(ti)} and e Xv ( ti ) is the noise vector e Xv y.) ~ Ny_i(0, diag{H (l) )(n)} _1 ) 
for i = 1 ,... ,n. In the above notation is the (V — 1) x 1 vector containing the 

Polya-gamma augmented data uj vu (U) at time ti corresponding to all the pairs of actors 
having v as one of the two. The same holds for Y t .r v y As in step [2], the observation 
equation (3.3) along with state equations in (2.3) for v and h = 1,..., H form a lin¬ 
ear Gaussian state space model from which the processes x v h = {x v h(ti), ■ ■ ■ x v h{t n )} T , 
x 'vh = Wvh^i- ■ ■ x vh( t n)V and m vh = {m vh (ti),.. .m vh (t n )} T can be updated via 
simulation smoothing [Durbin and Koopman 2002] under the same diffuse initialization 
at h, {x vh (ti),x' vh (ti),m vh (ti)} T ~ N 3 {0, diag(100,100,100)} for each h = l,...,H. 

Step [4]: Letting a 2 ~ Inv-Ga(a^, b^) and a 2 ~ Inv-Ga(a z , b z ), the hyper-priors for the 
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noise variances in the states equations (2.4), their full conditionals are 
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Step [5]: Similarly, assuming the variances n.,, ^ r '~- j Inv Ga( fix , fox') and r ^~ l Iirv Ga( Oy t! , bm }, 
independently for each v = 1,..., V and h = 1,..., if, their full conditionals are 
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for each v = 1,..., V and h = 1,..., H. 

Step [6]: Finally, given the posterior samples for the baseline process /r = {//(U),... g,(t n )} T 
and x v h = {x v h(t \),... x„/ l (f n )} T , for each v = 1,..., V and h = 1,... H, obtain the 
posterior samples for the dynamic edge probabilities by applying (2.2) as follow 

^T V u(ti) = [1 + exp {-/J,(U) - X v (ti) T X u (ti)}] \ 

for each v = 2,..., V, u = 1,..., v — 1 and time t z = t\,..., t n . 

To choose H, we repeat the above algorithm for increasing H, stopping when there is no 
substantial improvement for in-sample edge prediction based on the area under the ROC curve 
(AUC). As in-sample predictive strategies may suffer from over-fitting issues, we additionally 
assess our choice of H by exploring out-of-sample prediction and forecasting performance. 


3.1. Forecasting and predicting. Forecasting a future network based on past data is par¬ 
ticularly appealing for our motivating application in facilitating the design of specific policies; 
for example, aimed at outbreak prevention. If an individual contracts a disease at time t n , 
forecasts at time f n +i are a key to understand which students are at risk of contagion as a 
result of face-to-face proximity interactions. 

Under our Bayesian paradigm, a strategy to obtain one-step-ahead forecasts of future edges 
is to rely on the expectation of the forecasted predictive distribution defined as 

E{34u(in+l) | , . . . , Yt n } = [Ey t , u (t n+1 ){Ttm(2n+l) | ' K vu(t n j r i)} | Y tl ,...,Y tn ] 

(3-4) = E[7r„ u (t n+ i) | Y tl ,..., Y tn ], 

for each v = 2, ..., V and u = 1,. . ., v — 1. Hence equation (3.4) simply requires the posterior 
mean of the edge probabilities at time t n +\. The Markovian property implied by our state 
equations in (2.3)-(2.4) provides a natural procedure to obtain these quantities along with the 
entire posterior distribution for 7r^ n (t n+ i), for each v = 2,... ,V, u = 1,... ,v — 1. Specifically, 
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according to (2.2)-(2.4) samples from the posterior of ir vu {t n+ i) can be simply obtained by 
applying the equation 

TTvu(tn+ 1 ) = (l + exp[-{//(f n ) + 5 n fl'(t n )} - {x v (t n ) + 6 n x' v (t n )} T {x u (t n ) + S n x' u (t n )}}) 1 , 

to the posterior samples of the latent states at time t n , for v = 2,..., V and u = 1,..., v — 1. 

Recalling our data set structure, beside forecasting contacts at the next time within the 
first day, it is additionally of interest to predict the whole network dynamics in the second 
day, based on the estimates from the previous day. In particular, letting y*(ti) denote the 
random matrix encoding presence or absence of contacts among pairs of actors at time U in 
the second day, we predict the edges in y*(U) by focusing on 


E {y* vu (U)\Y tl ,...,Y t J 

(3.5) 


E7T„ ll (U)[ET* u (t i ){^-i;u(^) I n vu(U)} | Ytli ■ ■ ■ jEfn] 

E [tt V u{U) | Ytv ■ ■,Y tn ], 


for each v = 2,..., V, u = 1,..., v — 1 and time tj, where the expectation in (3.5) coincides 
with the posterior mean of the edge probability trajectories. Equation (3.5) relies on the 
assumption that the dynamic contacts at the second day are governed by the same process 
underlying data in the first day. If we had data on multiple days, we could refine our model 
to include dynamic changes across days instead of just within a given day, but we avoid such 
complexity here and use the second day as a test set to evaluate predictive performance. 


3.2. Online updating. Once the model has been estimated on data Yq,..., Yf n . new con¬ 
tact networks Y tn+1 ,..., Y tn+n can stream in. In order to rapidly update policies, such as 
disease surveillance, it is important to have a fast online updating algorithm for the posterior 
of the edge probabilities n vu (t n+ i ),..., TT vu (t n+ n), v = 2,..., V, u = 1,..., v — 1, including 
information from new networks Yt n+1 ,..., Yj _, without the need to rerun posterior compu¬ 
tation for the whole data from t± to t n+ 

Our LADY network model is amenable to fast updating due to the latent Kalman filter 
formulation. Exploiting the posterior means and covariances of the latent states at time t n and 
the estimated noise variances in the state equation, our online updating algorithm efficiently 
cycles between steps [1]—[3] and [6] of the Gibbs sampler only for new data Yt n+1 ,.. .Yt n+n , 
with the simulation smoother in [2] and [3] initialized at t n +i using the predictive distribution 
from the Kalman filter. Specifically we initialize states {/r(t n+ i), ia'(t n +i), z(f n +i)} T at t n +\ 
in [2] by assuming {/r(f n+ i), fi'(t n+ i), z(t n+ i)} T are distributed according to 

N 3 {T n [E{/i(f n )}, Ejy (t n )}, Ej>(t n )}] T , T n Y^ n Tn + R n diag(d^5 n , a 2 z 8 n )R^}, 

where [Ej/i^r,)}, E{ia'(t n )}, E{Y(f n )}] T is the vector of posterior means for the latent states 
at time t n . r^ )n is their 3x3 posterior covariance matrix and dj), are the estimated noise 
variances using the initial data set from t\ to t n . A similar initialization is considered in step 
[3] for {x vh (t n+1 ),x' vh (t n+1 ),m vh (t n+1 )} T obtaining 

N 3 {T n [E{x tI / l (f n )}, Y{x v f l (t n )}, E{?n„/, (t n )}] ,T n T Xvhtn T n + R n diag(d^ vh 5 n , d mvh 6 n )R n }, 
for v = 1,..., V and h = 1,..., H. 

Although the algorithm fixes the hyperparameters corresponding to the noise variances in 
the state equations at their posterior means, these quantities are time-constant and hence 
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can be accurately estimated by borrowing information across the whole time window. It is 
however straightforward to modify the algorithm to update the posterior distribution also for 
these quantities given the latent states stored in the initial sampling from t\ to t n and the 
updated ones from t n+ \ to t n +n- This strategy may be useful when n is small. We found few 
differences between the two procedures in our simulations and hence prefer the first strategy. 

It is also worth noticing that our procedure does not update the previous TT vu {t \),..., 7 r vu (t n ), 
v = 2,... ,V, u = 1,..., v — 1 given new data Y tn+1 ,... Y tri+n , but focuses only on the pos¬ 
terior of Tr vu (t n+ i ),..., TT vu (t n+ n), v = 2,... ,V, u = 1,... ,v — 1. This may affect the ability 
of our procedures to properly propagate uncertainty and reduce performance in updating 
^vu{t n + i), • • •, 7 T vu (t n+ n), v = 2,..., V, u = 1,..., v — 1. To mitigate this issue, while main- 
taing computational scalability, we run online updating for data Y tn _ j ,■■■■, Yt„, Tj n+1 ,... Y tn+n 
instead of only Yt n+1 ,... Yj n+ -. We found this correction to improve performance even when 
a small number j of past networks is included along with new data. 

3.3. Model checking. Our LADY network formulation and related procedures fall within 
the class of latent variable modeling of dynamic networks. Although these methodologies are 
appealing in accommodating heterogenous structures and facilitate tractable inference, the 
types of higher-order dependencies included may be limited by the conditional independence 
assumption and the characterization of the latent variables. Although conditional indepen¬ 
dence may at first appear overly-restrictive, multivariate categorical data — such as a vector 
of edges — can be expressed as conditionally independent given a sufficient number of latent 
factors without imposing any assumptions on the joint distribution; see for example Dunson 
and Xing [2009] for theoretical results. 

To check the flexibility of our model, we develop approaches for assessing model adequacy. 
In the Bayesian literature, it is common to rely on diagnostics comparing the posterior pre¬ 
dictive distribution associated with the model to the observed data; refer to Gelman et al. 
[2013] for an overview. In our case the posterior predictive distribution is defined as 

P n 

w{y(t 1 ),...,y(t n )\Y tl ,...,Y tn } = n n pr {y vu (ti) I TT vu (ti)}dU{TT I Y tl ,.. .,Y tn }, 

i =1 [vu]:v>u 

where pr {y vu (ti) \ tt vu (ti)} is the Bernoulli probability mass function in (2.1), while the 
quantity II{7r | Yt t ,..., Yt n } denotes the joint posterior distribution for the trajectories of the 
edge probabilities given the observed data Y tl ,..., Y tn . It is straightforward to simulate from 
pr{T(ti), • • •, y{t n ) | Y tl ,..., Y tn } exploiting equation (2.1) along with posterior samples for 
7 T vu (ti), v = 2,..., V, u = 1,..., v— 1 and ti = 1 1 ,..., t n . Specifically, for each MCMC sample 
of TT V u{ti), w e simulate contacts among the corresponding pair of actors from conditionally 
independent Bernoulli random variables given ir vu (ti). 

Exploiting the samples from the posterior predictive distribution, we evaluate the perfor¬ 
mance of our model in accommodating specific topological structures observed from the data. 
We focus on the dynamic network density ^2\ vu ]. v>u W«(L )/{ V(V — l)/2}, the time-varying 
actor degree J2 U7 c v YvniU), for v = 1,..., V, and the dynamic homophily by class and gender 
measured by the assortativity coefficient; see Newman [2003], equation 2. 

When the interest is on disease surveillance and outbreak prevention, the dynamic network 
density is a key quantity in summarizing the frequency of contacts including those leading to 
potential contagion. Actors’ degrees are appealing in providing a measure of the number of 
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Fig 3. Upper panels: true edge probabilities — arranged in matrix form — for the regimes in the simulation; 
colors go from white to black as the probability goes from 0 to 1. Lower panels: graphical representation showing 
for every time which regime — i.e. true edge probabilities — is considered to simulate the data. 


subjects at risk of contagion if an individual contracts a disease at a certain time. Evolution 
of homophily structures across time and environmental conditions are of interest from a social 
science perspective; see for example Stehle et al. [2013] for a study on gender homophily in 
face-to-face contact networks from an averaged perspective. In assessing model adequacy, we 
compare these network summary measures computed from the observed data to the posterior 
predictive distribution of these quantities generated under the presumed model. If the model 
is not sufficiently flexible, we expect the observed measures to fall in the tails of their poste¬ 
rior predictive distributions. We perform also out-of-sample assessments to evaluate overall 
performance of the model in characterizing the data. 

4. Simulation study. We implement a simulation study to assess the performance of our 
LADY network model in estimating trajectories with varying smoothness, accommodating 
streaming data and predicting future networks. We consider dynamic networks with V = 30 
actors monitored for n = 50 equally spaced times from 1 1 = 0 to t $o = 15. The time varying 
edges Y t .[ vu ] are simulated from model (2.1) with edge probabilities evolving in time across five 
regimes mimicking — - in a simple version — possible scenarios associated with our face-to-face 
student interactions. Refer to Figure 3 for a representation of the true edge probabilities. 

Specifically we consider three classes comprising ten students each and define also a gender 
variable. There are 5 males and 5 females in each class, corresponding to the subsets of actors 
V m = {1,..., 5,16,..., 25} and V[ = {6,..., 15, 26,..., 30}, respectively. The first regime 
represents school hours and is characterized by high probability of contact between students 
in the same class, and low chance of face-to-face interaction among students in different 
classes. The second regime encodes high gender homophily, which may arise during the breaks 
before and after lunch times when all the students can interact; see also analyses in Stehle 
et al. [2013]. The third regime is characterized by the first two classes sharing the same room 
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— for school hours or breaks — and hence, beside high within class probabilities of contact, 
we observe also a moderately high chance of contact between students in the first two classes. 
Regime four represents a possible scenario we have observed in our data during lunch times 
and confirmed in Figure 10 of Stehle et al. [2011]. Specifically, students in the second class 
are equally divided in two groups, with one attending lunch with students in the first class 
and the other with those in the third class. Hence we observe two block structures, with an 
additional subset of the students having no contacts with the others in leaving the school 
for lunch. Regimes five and four define also networks during the end of the school day, with 
groups of students gathering in the same room and progressively leaving the school. Actors 
1, 4, 10 and 12 go home at time £ 42 , whereas actors 16, 20, 26 and 28 leave the school at t^. 

Although this generative mechanism represents a substantially simplified version of our 
complex data set, the basic underlying structures and the rapid changes in specific topological 
patterns are in line with those we expect in our application. Moreover, considering edge 
probabilities obtained under representations different than ( 2 . 2 ) and evolving in time across 
a regime-switching process instead of the state equations (2.3)—(2.4) has the additional benefit 
of providing a more effective validation of our LADY network methodology, as the true edge 
probability processes are not generated from our model. 

4.1. Posterior inference and model checking. In performing posterior inference, we choose 
moderately diffuse priors for the noise variances in the state equations by letting = a z = 
a x = am = bfj, = b z = b x = b m = 0.01, and run 5,000 Gibbs iterations discarding the first 
1,000. To learn H we consider our selection procedure by performing posterior computation 
for increasing H = 1,2,... and provide posterior inference for the model having H total 
latent coordinates such that AUC#+i — AUC# < 0.01. The AUC for the model with only the 
baseline process is 0.603, while those for formulations with H = 1 and H = 2 are 0.901 and 
0.943, respectively. Increasing the coordinates from H = 2 to H = 3 we found no substantial 
improvement with an AUC of 0.947, so H = 2 is chosen. 

Convergence is assessed by visual inspection of the traceplots for quantities of interest, and 
by the Gelrnan and Rubin [1992] potential scale reduction factors (PSRFs). These quanti¬ 
ties are obtained by comparing between and within sub-chains variances, after splitting each 
chain of interest in four consecutive sub-chains of length 1,000, after burn-in. The median 
of the PSRFs for the chains of the edge probabilities Tr vu (U), v = 2,..., V, u = 1,..., v — 1 
and ti = 1 1 ,..., t n , is 1.01, with 99% being less than 1.15, providing evidence of convergence. 
Similar results are obtained for the network measures of interest, including the dynamic ex¬ 
pected density EE H:B> J OT (t ! )/{F(F - l)/2}] = Eh^: V{y vu (U)}/{V(V - l)/2} = 

Y.[vu]:v>u 7T vu(t i )/{V(V - l)/2}, the time-varying expected actor degree E{^ u ^ v y vu (U)} = 
E { yvu(U )} = n vu(ti ) for each v = 1,..., V, and the dynamic expected homophily 

by class and gender. As the expectation of the assortativity coefficients is not analytically 
available as a function of the edge probabilities, we obtain posterior samples for the expected 
assortativity coefficients via Monte Carlo methods. Specifically, for each posterior sample of 
7r, JU (t ? ;), v = 2,..., V, u = 1 ,...,v — 1 and ti = we simulate 100 networks from 

(2.1) and obtain approximated samples from the posterior distribution of the dynamic ex¬ 
pected assortativity by class and gender, by computing these coefficients for the 100 simulated 
networks and averaging. 

As shown in the upper panels of Figures 4 and 5, enforcing local adaptivity in the time- 
varying trajectories of the edge probabilities while accommodating across-actor heterogeneity, 
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EXPECTED ASSORTATIVITY BY CLASS 




Fig 4. Upper panels: trajectory of the posterior mean (grey line) and point-wise 0.95 highest posterior density 
intervals (grey segments) for dynamic expected network summary measures covering network density, assorta- 
tivity by gender and by class; true trajectories are represented by the black line. Bottom panels: for the same 
summary measures, mean trajectory (grey line) and point-wise 0.95 predictive intervals (grey segments) ob¬ 
tained from the posterior predictive distribution; black dots represent the corresponding time-varying network 
measures computed from the simulated data. 


allows our model to capture rapid changes in the true expected measures of interest, including 
time-varying network density, homophily structures and actors’ degrees. Moreover, although 
we rely on a latent space representation which does not explicitly parameterize dependencies 
among edges, our model can accurately accommodate topological structures of interest char¬ 
acterizing the observed dynamic networks. This is highlighted in the bottom panels of Figures 
4 and 5, comparing network summary measures computed from the observed data with their 
posterior predictive distribution — consistent with the procedures outlined in Section 3.3. 
Almost all the observed quantities are inside the 0.95 posterior predictive intervals. 

4.2. Online updating, forecasting and predictive performance. Table 1 compares forecast¬ 
ing and predictive performance of our model to those associated with two selected competi¬ 
tors, for times from 1 45 to £ 50 - Specifically, we consider the Gaussian process dynamic network 
model developed by Durante and Dunson [2014] and the temporal ERGM (TERGM) proposed 
by Hanneke, Fu and Xing [2010]. Durante and Dunson [2014] rely on our model formulation 
(2.1)—(2.2) but do not allow varying smoothness over time. Hanneke, Fu and Xing [2010] 
TERGM is instead a substantially different model which explicitly accounts for the effect of 
topological structures in the model formulation, rather than considering latent variables. 

In performing posterior computation under Durante and Dunson 2014, we consider the 
same hyperparameter settings in their simulation study, fixing H = 2 — as in the LADY 
network model for this simulation — with the GP length scales = k x = 0.01. Moderate 
changes in the length scales provided comparable results. The TERGM is instead estimated 
via bootstrapped pseudolikelihood procedures [Desmarais and Cranmer 2012] exploiting the 
R packages btergm and xergm. In defining the linear predictor under the TERGM represen- 
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Fig 5. Upper panels: trajectory of the posterior mean (grey line) and point-wise 0.95 highest posterior density 
intervals (grey segments) for the dynamic expected degree of selected actors; true trajectories are represented by 
the black line. Bottom panels: for the same summary measures, mean trajectory (grey line) and point-wise 0.95 
predictive intervals (grey segments) obtained from the posterior predictive distribution; black dots represent the 
corresponding time-varying actors ’ degrees computed from the simulated data. 


tation we consider a p* ERGM specification with alternating fe-stars [Robins et al. 2007] and 
triangle effects to account for transitivity patterns. We additionally include gender and class 
variables via main and homophily effects — using functions nodefactor() and nodematchQ, 
respectively. Finally, we account for temporal dependence by including a stability term which 
measures the tendency of an edge — or non-edge — at time t t to be also observed — or not 
observed — at the next time tj+ 1 . The main effects of the actors’ covariates were not signifi¬ 
cant, hence we drop these predictors in assessing forecasting and predictive performance. We 
additionally hold out the triangle effect, as the inclusion of this term substantially reduced 
forecasting and predictive performance. We also attempted an actor-oriented model using the 
R package RSiena but found convergence issues for the parameters in the rate function. 

For each time t; = £ 44 ,... ,£49 forecasting performance is assessed by estimating the three 
different models using data from t\ to t;, and forecasting edges at time t; + ±. Forecasts under 
the GP dynamic network follow procedures outlined in Durante and Dunson [2014], Under 
the TERGM, forecasting of future networks proceeds via simulation methods using the gof () 
function in the R package ergm; see also Hunter et al. [2008]. For our LADY network model, 
we proceed by first updating the posterior distributions of the edge probabilities at t; using 
estimates from t\ to U-± according to procedures in Section 3.2 — with j = 5 — and 
then forecast edges at time t; + \ by applying the forecasting methods in Section 3.1 to the 
posterior distributions of the edge probabilities from the online updating. Joining online 
updating and forecasting is appealing in providing a fast strategy which avoids re-running 
posterior computation for the whole data set when a one-step-ahead forecast in required. 

In evaluating predictive performance we instead simulate new networks from the same 
mechanism considered to generate training data — see Figure 3 — and compare the areas 
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Table 1 

For our model and the two competitors, forecasting and predictive performance for times from £45 to £50 . 



£45 

£46 

£47 

£48 

£49 

£50 

Forecasting Performance monitored via areas under the ROC curve 

LADY 

GP 

TERGM 

0.913 [0.90,0.92] 

0.894 [0.88,0.91] 

0.877 [0.84,0.89] 

0.837 [0.83,0.85] 

0.803 [0.79,0.81] 

0.739 [0.57,0.84] 

0.916 [0.91,0.92] 

0.891 [0.88,0.90] 

0.847 [0.74,0.90] 

0.923 [0.92,0.93] 

0.921 [0.91,0.93] 

0.818 [0.73,0.88] 

0.936 [0.93,0.95] 

0.916 [0.90,0.93] 

0.848 [0.77,0.89] 

0.935 [0.93,0.94] 

0.923 [0.92,0.93] 

0.855 [0.79,0.91] 

Predictive Performance monitored via areas under the ROC curve 

LADY 

0.958 [0.95,0.97] 

0.958 [0.95,0.96] 

0.956 [0.95,0.96] 

0.954 [0.95,0.96] 

0.949 [0.94,0.96] 

0.936 [0.93,0.94] 

GP 

0.944 [0.93,0.96] 

0.940 [0.93,0.95] 

0.934 [0.92,0.95] 

0.923 [0.91,0.94] 

0.933 [0.92,0.94] 

0.925 [0.91,0.93] 

TERGM 

0.877 [0.82,0.92] 

0.873 [0.81,0.90] 

0.888 [0.83,0.92] 

0.877 [0.83,0.91] 

0.879 [0.80,0.92] 

0.891 [0.87,0.90] 


under the ROC curves when predicting their edges based on the estimates from the three 
competing methods — exploiting training data Y tl ,..., 1) 50 . Edge prediction under our LADY 
network model and Durante and Dunson [2014] GP dynamic network use equation (3.5). For 
TERGM we exploit again simulation procedures from the gof () function. To more reliably 
assess performance, we repeat the above forecasting and out-of-sample prediction exercises 
for 100 different simulated data sets and report in Table 1 the median along with the 0.25 
and 0.75 quantiles of the 100 areas under the ROC curves obtained for every time. 

As shown in Table 1 our procedure is characterized by improved forecasting and predictive 
performance compared to the GP dynamic network model and TERGM. Durante and Dunson 
[2014] accommodate heterogenous structures but assume time-constant smoothness. Hanneke, 
Fu and Xing [2010] explicitly account for higher-order dependencies but force the model 
parameters to be shared among actors and constant across time. These assumptions lead 
to reduced performance compared to our procedure which incorporates both across-actor 
heterogeneity and time-varying smoothness. These results additionally highlight the good 
performance of our online updating procedures. 

As expected, forecasting performance decreases at £46 since the models have no experience 
of sudden regime changes. However, it is interesting to notice how incorporating local adap¬ 
tivity provides rapid adjustments of the estimates to new regimes once they are observed, 
improving subsequent forecasts. The dynamic GP network model requires more times to adapt 
to new regimes due to the time-constant smoothness assumption. Reduced performance at 
£46 is not an issue when predicting new networks generated under the same mechanism, as the 
whole training data set ,..., Yj so already inform on regime changes. In the out-of-sample 
prediction exercise, performance depends on the flexibility of the model in accommodating 
rapid regime changes along with their associated network structures. 

Inference under our LADY network model takes ~ 75 minutes for posterior computation, 
~ 12 minutes for online updating and ~ 2 second for forecasting. The dynamic GP network 
model requires comparable time for forecasting but is substantially slower in performing pos¬ 
terior computation — ~ 500 minutes — due to the computational bottlenecks of the Gaussian 
processes. Estimation under TERGM is faster, but simulation methods for forecasting and 
predictions require more time. Our algorithms are run in a naive R (version 3.2.1) implemen¬ 
tation in a machine with one Intel Core i5 2.3GHz processor and 4GB ol RAM. Hence, there 
are significant margins to further improve computational time. 
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5. LADY networks for face-to-face interaction data. We apply our LADY network 
model outlined in Sections 2-3 to the face-to-face contact data Yq, • • •, Yt 51 described in 
Section 1.1, under the same settings of the simulation study, with H = 4. We select H = 4 as 
adding a further coordinate increases the area under the ROC curve by less than 0.01, while 
AUC 4 —AUC 3 > 0.01. In performing posterior inference we consider 5,000 Gibbs samples with 
a burn-in of 1,000. Convergence is assessed via visual inspection of selected traceplots and 
by Gelman and Rubin [1992] potential scale reduction factors for the quantities of interest, 
obtaining comparable results to those in the simulation study. Using a four-dimensional latent 
space produces an area under the ROC curve for in-sample edge prediction of 0.978. This is 
an interesting result in suggesting that the 120 x 120 time-varying adjacency matrices can 
be adequately characterized by collapsing information into a substantially lower-dimensional 
space. This insight is confirmed by results in Figures 6-7, highlighting accurate performance 
in modeling dynamic network structures of interest. 

5.1. Posterior inference and model checking in the application. The trajectory of the pos¬ 
terior mean for the expected network density in the upper left plot of Figure 6 provides an 
interesting overview of the overall dynamic contact behavior, consistent with school schedule 
and changing environments summarized in Figure 10 of Stehle et al. [2011]. It is interesting 
to note how the expected network density evolves on low values suggesting a sparse network, 
with our adaptive procedure additionally capturing a rapid increase in the chance of contact 
occurring during school breaks and the beginning or the end of lunch times for groups of 
students. According to the left plot in the bottom panel of Figure 6 , the posterior predic¬ 
tive distribution arising from our formulation is sufficiently flexible in accommodating the 
evolution of these summary measures. 

In studying dynamic homophily patterns, we investigate the posterior distribution of the 
time-varying expected assortativity coefficients by class and gender, computed for the 115 
students. We hold out teachers in homophily studies as we don’t have gender information for 
these actors and we are interested in social interactions among students — consistent with 
Stehle et al. [2013]. In investigating gender homophily, Stehle et al. [2013] focus on a single 
network obtained by aggregating the face-to-face interaction data that are observed in pre¬ 
selected nonconsecutive time windows when the occasions of contact are expected to have less 
environmental restrictions — i.e. break and lunch times. Although this is a reasonable proce¬ 
dure, information on spatial environments or events are not always available and the choice 
of aggregation intervals is not necessary unique. Moreover, investigating gender homophily 
for a single aggregated network provides only an averaged overview of a dynamic system. We 
instead study homophily structures as they evolve in time, and allow these quantities to be 
different in nonconsecutive time windows. 

Our results in the upper middle plot of Figure 6 partially confirm findings in Stehle et al. 
[2013], with the posterior distributions of the dynamic expected assortativity coefficients 
concentrated on positive values during breaks and lunch times. However the expected assor¬ 
tativity is higher during lunch compared to breaks, with the posterior for these coefficients 
including the value 0 during the last break. Hence Stehle et al. [2013] may over-estimate gen¬ 
der homophily in correspondence of breaks and under-estimate this property during lunches. 

The expected assortativity by class is always positive, with the posterior distributions con¬ 
centrating on substantially high values during school hours, when contacts are restricted by 
the spatial environments displayed in Figure 10 of Stehle et al. [2011]; refer to the upper 
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Fig 6. Upper panels: trajectory of the posterior mean (grey line) and point-wise 0.95 highest posterior density 
intervals (grey segments) for dynamic expected network summary measures covering network density, assorta- 
tivity by gender and assortativity by class. Bottom panels: for the same measures, mean trajectory (grey line) 
and point-wise 0.95 predictive intervals (grey segments) obtained from the posterior predictive distribution; 
black dots represent the corresponding time-varying network measures computed from the observed data. 


right plot of Figure 6. Model checking in the bottom middle and right plots of Figure 6 high¬ 
lights an overall good performance of our procedures in characterizing also these higher-order 
homophily structures. These are key results, provided that we embed a 120 x 120 dynamic net¬ 
work into a substantially lower-dimensional space made by four latent coordinates, without 
any further information on the dynamic effect of exogenous variables. Few issues are found 
in accommodating rapid changes in assortativity by class. A reason behind this slight lack 
of fit is that H = 4 latent coordinates may not be sufficient to characterize class homophily 
in specific time windows. It is still an active area of research to accommodate latent space 
dimensions which adaptively change as a function of time. Similarly to our procedure, most 
available contributions rely on time-constant space dimensions. Although a subset of the ob¬ 
served class assortativity coefficients are not within the 0.95 posterior predictive intervals, 
most of these values are contained in 0.99 posterior predictive intervals. Hence, we maintain 
H = 4 to avoid over-fitting. 

Beside accommodating global network structures, our procedure can flexibly characterize 
actor-specific connectivity measures of interest. According to the upper panels of Figure 7, 
incorporating across-actor heterogeneity and time-varying smoothness allows us to flexibly 
account for substantially different patterns and dynamic changes in expected actors’ degrees. 
As shown in the bottom panels of Figure 7, the posterior predictive distributions for the 
dynamic actors’ degrees arising from our estimates are characterized by a very accurate 
performance in accommodating these time-varying observed quantities. 

5.2. Online updating, forecasting and predictive performance in the application. Once the 
model has been estimated on data from t\ to , a new contact network Y;, can stream in 
along with the information that an individual — or a subset of them — has contracted a 
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Fig 7. Upper panels: trajectory of the posterior mean (grey line) and point-wise 0.95 highest posterior density 
intervals (grey segments) for the dynamic expected degree of selected actors. Bottom panels: for the same sum¬ 
mary measures, mean trajectory (grey line) and point-wise 0.95 predictive intervals (grey segments) obtained 
from the posterior predictive distribution; black dots represent the corresponding time-varying actors ’ degrees 
computed from the observed data. 


specific disease at t t . For outbreak prevention, it is fundamental to rapidly update estimates at 
time t; and forecast the contact network structures at the next time t;+\. Our LADY network 
model can suitably accomplish this task by online updating of the posterior distribution for 
the edge probabilities at t; exploiting strategies in Section 3.2 — with j = 5 — and then 
forecasting the posterior distribution of these probabilities at the next time t ; + 1 by applying 
equation in Section 3.1 to the MCMC samples from the online updating. Once these quantities 
are available, it is easy to derive the approximate forecasted predictive distribution at t; + \ 
along with related quantities of interest, such as its expected value for forecasting edges and 
the distribution of future topological structures. Figures 8, 9 and the upper left plot of Figure 
10 evaluate the performance of our joint online updating and forecasting procedure for each 
ti = tio, • • •, iso, under different perspectives. 

The left panels of Figure 8 compare the observed degrees for selected students — in the 
five different classes — with their mean and quantiles arising from the forecasted predictive 
distribution. Time-varying actors’ degrees are fundamental for disease surveillance and accu¬ 
rate forecasts for these quantities facilitates monitoring of the infectivity for each individual 
at future times. According to the left panels of Figure 8, our strategies provide in general 
a good performance in forecasting dynamic degrees. We observe, however, a slight tendency 
towards over-estimating these quantities. Although this bias is undesirable, for the sake of 
outbreak prevention, slightly over-estimating actors’ degrees suggests conservative policies. 

The right panels of Figure 8 add further insights by showing the proportion of the forecasted 
degree due to connections with students in the different classes. This provides a higher-level 
measure of which groups of individuals are at risk of contagion at L+i if a given individual 
contracts disease at fj, for each f* = fio, • • •, iso- Results further confirm our good performance 
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DEGREE ACTOR 48 IN CLASS 3A [forecasted predictive distr.] 
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FlG 8 . Left panels: for selected students in the five different classes, mean trajectory (grey line) and point-wise 
0.95 predictive intervals (grey segments) of their degree obtained from the one-step-ahead forecasted predictive 
distribution from tn to 1 51 ; black dots represents the corresponding time-varying actors ’ degrees computed from 
the observed data. Right panels: for the same students, barplots representing the time-varying mean of their 
degree obtained from the one-step-ahead forecasted predictive distribution from tu to tsi. Colors in the bars 
represent the proportion of the forecasted degree due to connections with each class. Dark red (first class), light 
red (second class), white (third class), light blue (fourth class), dark blue (fifth class), green (teachers). 
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From 10:30 to 12:00 From 12:10 to 14:00 From 14:10 to 17:10 



Fig 9. Weighted network visualization with weights obtained by averaging the mean of the one-step-ahead 
forecasted predictive distributions over three time windows. Edges are not displayed to facilitate graphical 
analysis. Actors’ positions are obtained applying the Fruchterman and Reingold [1991] force-directed placement 
algorithm, whereas their dimensions are proportional to the corresponding forecasted degree averaged over the 
three time windows. The colors indicate class membership: dark red (first- class), light red (second class), white 
(third class), light blue (fourth class), dark blue (fifth class), green (teachers). 


in forecasting heterogenous contact patterns and dynamic changes in actors degrees. Consis¬ 
tent with the findings on honrophily, contacts with students from the same class represent a 
high proportion of the forecasted dynamic degrees. This is more evident during school hours 
than breaks or lunch times where we observe more mixed patterns including increased across 
class contacts and students apparently leaving the school — such as for example actor 71. 

These findings are confirmed in Figure 9 providing a graphical representation of future 
networks with actors positions depending on the forecasted edges — according to equation 
(3.4) — averaged over three time windows of interest. Although we do not explicitly include 
environmental information, as shown in Figure 9 our procedure is sufficiently flexible to 
account for these structures from an unsupervised perspective. Consistent with Figure 10 in 
Stehle et al. [2011] we forecast evident community structures induced by class membership 
during the morning hours, with students in classes 1A, 3A and 4B being spatially closer than 
those in the remaining classes. This is consistent with classes 1A, 3A and 4B sharing the 
playground during the morning break according to Figure 10 in Stehle et al. [2011]. Lunch 
times are characterized by a sparse structure with two communities and a wide set of students 
having essentially no face-to-face contacts. The first community comprises students in classes 
1A, 2B and part of those in class 3A. The second includes actors from classes 4B, 5B and the 
remaining students from class 3A. Also these forecasts are consistent with the approximate 
school schedule presented in Stehle et al. [2011], with a subset of the students leaving the 
school during lunch and the remaining individuals sharing the canteen in two different groups 
at consecutive times. As expected, the results in the afternoon hours are similar to those in 
the morning, with a slightly more sparse structure due the fact that the students increasingly 
leave the school towards the end of the day. 

The upper left plot in Figure 10 assesses forecasting performance by showing for each 
time from to L 51 the AUCs when forecasting the edges in each network Y ti+1 , for ti+\ = 
in,... ,^ 51 , with the expectation of the forecasted predictive distribution — estimated from 
data Y tl ,...,Y ti under our online updating and forecasting routine. The AUCs evolve on 
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Fig 10. Upper panels: for times from tn to t$i in day one, forecasting performance for our LADY network 
model and the TERGM. Performance is assessed via the areas under the ROC curves when forecasting the 
future edges with the mean of their corresponding one-step-ahead forecasted predictive distributions. Bottom 
panels: predictive performance for our LADY network model and the TERGM. Performance is assessed via 
the areas under the ROC curves when predicting edges in the networks y t *,..., Y* n from day two, with the 
mean of their corresponding posterior predictive distribution estimated from the contact data in day one. 


high values, suggesting an overall good performance in forecasting of future edges, with 
more evident decrements in correspondence of the beginning, mid and end of the lunch 
time windows. These times are characterized by rapid variations in contact behavior due to 
students rapidly changing environments; refer to Figure 10 in Stehle et al. [2011]. Hence — 
recalling also insights in the simulation study — this decreased forecasting performance is 
reasonably related to the fact that the model has no experience of sudden regime changes. 
Although we face reduced forecasting performance in specific times, our procedure almost 
always improves forecasts from TERGM. Refer to the upper right plot in Figure 10. 

We conclude our analysis by evaluating the performance in predicting the edges of networks 
Yj *,..., Y f * during the second day, based on the information provided by the face-to-face 
contact networks in the first day Y^,... ,Yf n — consistent with discussion in Section 3.1. 
In particular, we study the areas under the ROC curves when predicting the edges in each 
network Y f * with the mean of their corresponding posterior predictive distribution estimated 
from the contact data in day one; see equation (3.5). As time fsi is not available in the second 
day, we assess out-of-sample predictive performance using data and estimates from t\ to t§ o- 
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Results are displayed in the bottom panels of Figure 10. 

We obtain a general good performance when predicting contacts in the second day, based on 
estimates from day one. More evident decrements are found in correspondence of lunch times 
and the afternoon break. This may suggest that the dynamic contact networks at the second 
day are governed by slightly different underlying patterns than those associated with the first 
day, for these time windows. Also in this case we almost always improve results from the 
TERGM. These results further confirm the need of procedures accounting for heterogenous 
and dynamic dependence patterns in such frameworks. 

6. Discussion. Although there has been an abundant interest in recent years in the 
study of dynamic networks, flexible methods for analyzing particular relational data have 
lagged behind the increasingly routine collection of such networks in several applied fields. 
Motivated by face-to-face dynamic contact networks, our methodology aims to take a further 
step towards addressing some of the current issues in dynamic network inference. 

Our model has been constructed using latent similarity measures defined by the dot product 
of actor-specific latent coordinate vectors, with entries evolving in continuous time via nested 
Gaussian process to flexibly incorporate time-varying smoothness patterns. Using matrix 
factorization procedures, our LADY network model can accommodate moderately large V, 
and considering a state space representation of the nGP, we further allow scaling to larger 
time grids. Adapting the Polya-gamma data augmentation strategy to our specific setting, 
we develop a simple and efficient Gibbs sampler for posterior computations, which utilizes 
standard results of Kalman filter for transformed Gaussian data. This further facilitates the 
development of forecasting, prediction and online updating procedures for fast inference. 

Simulation studies confirm the good performance of the developed methodologies and the 
application allows us to learn interesting patterns in global and actor-specific network struc¬ 
tures while confirming accurate forecasting and predictive performance. Recalling discussions 
in previous sections, there are several directions for future research. These include developing 
procedures to facilitate scaling to substantially larger sets of actors and improve model formu¬ 
lation to explicitly account for instantaneous and lagged covariates effects, without relying on 
potentially restrictive assumptions such as those typically encountered in dynamic ERGMs. 
Currently our model finds issues in scaling to very large networks and although our proce¬ 
dures have good inference and forecasting performance under an unsupervised perspective, 
careful inclusion of actors or edge covariates may further improve results. 

We conclude our discussion by highlighting further fields of application for our methodol¬ 
ogy. In fact, differently from TERGM and most of the available models specifically tailored 
for dynamic network inference, our LADY network model has a broader range of applica¬ 
bility also outside the temporal network field. In particular, our methods can be applied to 
network-valued data sets in which multiple observations of the same network are available 
along with a continuous predictor, instead of time. This is the case of neuroscience applica¬ 
tions providing a network of structural interconnections among a common set of brain regions 
for different individuals along with intelligence scores or personality traits — among others. 
Replacing time with one of these predictors allows us to recast our LADY network model 
within a network regression framework which facilitates learning and prediction of changes 
in brain structural connectivity patterns across a cognitive trait of interest. 
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